SCA: recovering single-cell heterogeneity through information-based dimensionality reduction

Genome Biol. 2023 Aug 25;24(1):195. doi: 10.1186/s13059-023-02998-7.

Abstract

Dimensionality reduction summarizes the complex transcriptomic landscape of single-cell datasets for downstream analyses. Current approaches favor large cellular populations defined by many genes, at the expense of smaller and more subtly defined populations. Here, we present surprisal component analysis (SCA), a technique that applies the information-theoretic notion of surprisal to dimensionality reduction, a novel use that promotes more meaningful signal extraction. For example, SCA uncovers clinically important cytotoxic T-cell subpopulations that are indistinguishable using existing pipelines. We also demonstrate that SCA substantially improves downstream imputation. SCA's efficient information-theoretic paradigm has broad applications to the study of complex biological tissues in health and disease.
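
For readers unfamiliar with the term, surprisal (self-information) of an event with probability p is -log2(p) bits, so rarer events carry more information. The sketch below illustrates only that generic definition. It is not SCA's scoring procedure, which the abstract does not describe, and the function name and example probabilities are hypothetical.

```python
import math


def surprisal_bits(p: float) -> float:
    """Return the surprisal (self-information) of an event with probability p, in bits.

    Generic textbook definition, -log2(p); not the SCA algorithm itself.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in the interval (0, 1]")
    return -math.log2(p)


# Hypothetical probabilities chosen only for illustration.
common_event = surprisal_bits(0.5)   # an event seen half the time
rare_event = surprisal_bits(0.01)    # an event seen 1% of the time

print(f"common: {common_event:.2f} bits, rare: {rare_event:.2f} bits")
```

The rare event scores higher than the common one, which is the intuition behind using surprisal to keep small, subtly defined populations from being drowned out by large ones.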

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Gene Expression Profiling*
  • Transcriptome*